Due Jun 14, 2:59 AM EDT
If searching among a large number of hyperparameters, you should try values in a grid rather than random values, so that you can carry out the search more systematically and not rely on chance. True or False?
Every hyperparameter, if set poorly, can have a huge negative impact on training, and so all hyperparameters are about equally important to tune well. True or False?
Yes. We've seen in lecture that some hyperparameters, such as the learning rate, are more critical than others.
During hyperparameter search, whether you try to babysit one model (“Panda” strategy) or train a lot of models in parallel (“Caviar”) is largely determined by:
If you think β (hyperparameter for momentum) is between 0.9 and 0.99, which of the following is the recommended way to sample a value for beta?
Finding good hyperparameter values is very time-consuming. So typically you should do it once at the start of the project, and try to find very good hyperparameters so that you don’t ever have to revisit tuning them again. True or false?
In batch normalization as presented in the videos, if you apply it on the lth layer of your neural network, what are you normalizing?
In the normalization formula znorm(i)=σ2+εz(i)−μ, why do we use epsilon?
Which of the following statements about γ and β in Batch Norm are true?
After training a neural network with Batch Norm, at test time, to evaluate the neural network on a new example you should:
Which of these statements about deep learning programming frameworks are true? (Check all that apply)